This workbook presents the next steps in the basic work flow for scRNAseq analysis 1. Merging samples and batch correction 2. Clustering 3. Cluster annotation 4. Compare proportions of cell types
This workbook assumes two Seurat objects 1. adolescent and 2. adult brain samples. The Seurat objects are created and the data is cleaned in Notebook 2. Running
library(Seurat)
Registered S3 method overwritten by 'data.table':
method from
print.data.table
Registered S3 method overwritten by 'htmlwidgets':
method from
print.htmlwidget tools:rstudio
Attaching SeuratObject
library(tidyverse)
Registered S3 methods overwritten by 'dbplyr':
method from
print.tbl_lazy
print.tbl_sql
── Attaching packages ────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.2 ✔ purrr 1.0.1
✔ tibble 3.2.1 ✔ dplyr 1.1.1
✔ tidyr 1.3.0 ✔ stringr 1.5.0
✔ readr 2.1.3 ✔ forcats 0.5.2
── Conflicts ───────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
Merge data objects
merge_seurate <- merge(adolescent_data_seurat,adult_data_seurat)
Warning in CheckDuplicateCellNames(object.list = objects) :
Some cell names are duplicated across objects provided. Renaming to enforce unique cell names.
merge_seurate
An object of class Seurat
20116 features across 9044 samples within 1 assay
Active assay: RNA (20116 features, 0 variable features)
unique(merge_seurate$orig.ident)
[1] "Adolescent" "Adult"
Integrate objects using the Seurat function
sample.list <- SplitObject(merge_seurate, split.by = "orig.ident")
# We have already normalized and identified variable features in each sample
# If we had not done so we can normalize here
#for (i in 1:length(sample.list)){
# org.list[[i]] <- NormalizeData(org.list[[i]], verbose = FALSE)
#org.list[[i]] <- FindVariableFeatures(org.list[[i]], selection.method = "vst")
#}
# Now we find features that can act as anchors between the two samples
int.anchors <- FindIntegrationAnchors(object.list = sample.list, dims = 1:50)
Computing 2000 integration features
No variable features found for object1 in the object.list. Running FindVariableFeatures ...
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
No variable features found for object2 in the object.list. Running FindVariableFeatures ...
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Scaling features for provided objects
| | 0 % ~calculating
|+++++++++++++++++++++++++ | 50% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Finding all pairwise anchors
| | 0 % ~calculating
Running CCA
Merging objects
Finding neighborhoods
Finding anchors
Found 7587 anchors
Filtering anchors
Retained 4795 anchors
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01m 07s
integrated_seurate <- IntegrateData(anchorset = int.anchors, dims = 1:50)
Merging dataset 2 into 1
Extracting anchors for merged samples
Finding integration vectors
Finding integration vector weights
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Integrating data
Perform dimensional reduction PCA and UMAP on the merge and intergrated objects
# merge object
DefaultAssay(merge_seurate) <- "RNA"
merge_seurate <- ScaleData(merge_seurate, verbose = FALSE)
# in the merge data set we sill need features for the PCA input
merge_seurate <- FindVariableFeatures(merge_seurate, selection.method = "vst")
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
merge_seurate <- RunPCA(merge_seurate, npcs = 30, verbose = FALSE)
merge_seurate <- RunUMAP(merge_seurate, reduction = "pca", dims = 1:30)
15:22:55 UMAP embedding parameters a = 0.9922 b = 1.112
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
15:22:55 Read 9044 rows and found 30 numeric columns
15:22:55 Using Annoy for neighbor search, n_neighbors = 30
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
15:22:55 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
15:22:56 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpglCqdx/file13ea169e11381
15:22:56 Searching Annoy index using 1 thread, search_k = 3000
15:22:57 Annoy recall = 100%
15:22:59 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
15:23:00 Initializing from normalized Laplacian + noise (using irlba)
15:23:00 Commencing optimization for 500 epochs, with 385934 positive edges
Using method 'umap'
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
15:23:15 Optimization finished
Repeat PCA and UMAP for the integrated object
Idents(integrated_seurate) <- "integrated"
integrated_seurate <- ScaleData(integrated_seurate, verbose = FALSE)
# in the integrated features will be the pca input
integrated_seurate <- RunPCA(integrated_seurate, npcs = 30, verbose = FALSE)
integrated_seurate <- RunUMAP(integrated_seurate, reduction = "pca", dims = 1:30)
15:28:19 UMAP embedding parameters a = 0.9922 b = 1.112
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
15:28:19 Read 9044 rows and found 30 numeric columns
15:28:19 Using Annoy for neighbor search, n_neighbors = 30
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
15:28:19 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
15:28:20 Writing NN index file to temp file /var/folders/k4/khtkczkd5tn732ftjpwgtr240000gn/T//RtmpglCqdx/file13ea137c7b075
15:28:20 Searching Annoy index using 1 thread, search_k = 3000
15:28:22 Annoy recall = 100%
15:28:22 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
15:28:24 Initializing from normalized Laplacian + noise (using irlba)
15:28:24 Commencing optimization for 500 epochs, with 391270 positive edges
Using method 'umap'
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
15:28:39 Optimization finished
Visualize the UMAPs comparing the merged and integrated objects
#library(ggplot2)
p1 <- DimPlot(merge_seurate, group.by = "orig.ident") + ggtitle("Merge")
p2 <- DimPlot(integrated_seurate, group.by = "orig.ident") + ggtitle("Integrated")
p1
p2
We can see a large shift in the two populations when the data is integrated (harmonized).
Cluster the integrated data
# see the importance of the pca components
ElbowPlot(integrated_seurate, ndims=30)
integrated_seurate
An object of class Seurat
22116 features across 9044 samples within 2 assays
Active assay: integrated (2000 features, 2000 variable features)
1 other assay present: RNA
2 dimensional reductions calculated: pca, umap
Choose 15 PCs There are 9044 cells. A common rule of thumb for choosing k for nearest neighbours is using the square root of the number of cells = 95
# calculate the square root
sqrt(9044)
[1] 95.09995
# the K parameter changes the size of clusters by changing the starting nodes input into the Louvain network
integrated_seurate <- FindNeighbors(integrated_seurate, dims = 1:15, k.param = 95)
Computing nearest neighbor graph
Computing SNN
# the number of clusters is dependent on the resolution a number from 0-2.
# Higher values make more clusters
# we include
integrated_seurate <- FindClusters(integrated_seurate, resolution = c(0,0.05,0.25,0.4,0.5,0.6,1,1.5) )
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 1.0000
Number of communities: 1
Elapsed time: 5 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.9727
Number of communities: 3
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.9067
Number of communities: 5
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8635
Number of communities: 8
Elapsed time: 3 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8466
Number of communities: 9
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8307
Number of communities: 9
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.7757
Number of communities: 12
Elapsed time: 4 seconds
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9044
Number of edges: 1529080
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.7176
Number of communities: 15
Elapsed time: 5 seconds
# we can visualize which cells are grouped together at different resolutions using clustree
library(clustree)
Loading required package: ggraph
clustree(integrated_seurate, prefix = "integrated_snn_res.")
Visualize the UMAP of the different cluster resolutions
res <- c(0.05,0.25,0.4,0.5,0.6,1,1.5)
resolutions <- paste("integrated_snn_res.", res, sep="")
resolutions
[1] "integrated_snn_res.0.05" "integrated_snn_res.0.25" "integrated_snn_res.0.4"
[4] "integrated_snn_res.0.5" "integrated_snn_res.0.6" "integrated_snn_res.1"
[7] "integrated_snn_res.1.5"
for(r in resolutions){
print(DimPlot(integrated_seurate, group.by = r))
}
NA
NA
Now we choose a resolution to annotate. We are expecting microglia, astrocytes, oligodendrocytes and OPCs. Resolution 0.4 looks like a good amount of clusters.
Cluster annotation
# find cluster markers
# we need to select the level of clustering that we want to annotate
Idents(integrated_seurate) <- "integrated_snn_res.0.4"
ClusterMarkers <- FindAllMarkers(integrated_seurate, only.pos = TRUE)
Calculating cluster 0
| | 0 % ~calculating
|+ | 1 % ~05s
|++ | 3 % ~05s
|++ | 4 % ~05s
|+++ | 5 % ~05s
|++++ | 6 % ~05s
|++++ | 8 % ~05s
|+++++ | 9 % ~05s
|++++++ | 10% ~04s
|++++++ | 12% ~04s
|+++++++ | 13% ~04s
|++++++++ | 14% ~04s
|++++++++ | 15% ~04s
|+++++++++ | 17% ~04s
|+++++++++ | 18% ~04s
|++++++++++ | 19% ~04s
|+++++++++++ | 21% ~04s
|+++++++++++ | 22% ~04s
|++++++++++++ | 23% ~04s
|+++++++++++++ | 24% ~04s
|+++++++++++++ | 26% ~04s
|++++++++++++++ | 27% ~04s
|+++++++++++++++ | 28% ~04s
|+++++++++++++++ | 29% ~04s
|++++++++++++++++ | 31% ~04s
|+++++++++++++++++ | 32% ~03s
|+++++++++++++++++ | 33% ~03s
|++++++++++++++++++ | 35% ~03s
|++++++++++++++++++ | 36% ~03s
|+++++++++++++++++++ | 37% ~03s
|++++++++++++++++++++ | 38% ~03s
|++++++++++++++++++++ | 40% ~03s
|+++++++++++++++++++++ | 41% ~03s
|++++++++++++++++++++++ | 42% ~03s
|++++++++++++++++++++++ | 44% ~03s
|+++++++++++++++++++++++ | 45% ~03s
|++++++++++++++++++++++++ | 46% ~03s
|++++++++++++++++++++++++ | 47% ~03s
|+++++++++++++++++++++++++ | 49% ~03s
|+++++++++++++++++++++++++ | 50% ~03s
|++++++++++++++++++++++++++ | 51% ~03s
|+++++++++++++++++++++++++++ | 53% ~02s
|+++++++++++++++++++++++++++ | 54% ~02s
|++++++++++++++++++++++++++++ | 55% ~02s
|+++++++++++++++++++++++++++++ | 56% ~02s
|+++++++++++++++++++++++++++++ | 58% ~02s
|++++++++++++++++++++++++++++++ | 59% ~02s
|+++++++++++++++++++++++++++++++ | 60% ~02s
|+++++++++++++++++++++++++++++++ | 62% ~02s
|++++++++++++++++++++++++++++++++ | 63% ~02s
|+++++++++++++++++++++++++++++++++ | 64% ~02s
|+++++++++++++++++++++++++++++++++ | 65% ~02s
|++++++++++++++++++++++++++++++++++ | 67% ~02s
|++++++++++++++++++++++++++++++++++ | 68% ~02s
|+++++++++++++++++++++++++++++++++++ | 69% ~02s
|++++++++++++++++++++++++++++++++++++ | 71% ~02s
|++++++++++++++++++++++++++++++++++++ | 72% ~01s
|+++++++++++++++++++++++++++++++++++++ | 73% ~01s
|++++++++++++++++++++++++++++++++++++++ | 74% ~01s
|++++++++++++++++++++++++++++++++++++++ | 76% ~01s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~01s
|++++++++++++++++++++++++++++++++++++++++ | 78% ~01s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~01s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~01s
|++++++++++++++++++++++++++++++++++++++++++ | 82% ~01s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~01s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~01s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~01s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++ | 88% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=06s
Calculating cluster 1
| | 0 % ~calculating
|+ | 1 % ~15s
|++ | 2 % ~15s
|++ | 3 % ~15s
|+++ | 4 % ~15s
|+++ | 5 % ~15s
|++++ | 6 % ~14s
|++++ | 7 % ~14s
|+++++ | 8 % ~14s
|+++++ | 9 % ~14s
|++++++ | 10% ~14s
|++++++ | 11% ~14s
|+++++++ | 12% ~13s
|+++++++ | 13% ~13s
|++++++++ | 14% ~13s
|++++++++ | 15% ~13s
|+++++++++ | 16% ~13s
|+++++++++ | 18% ~13s
|++++++++++ | 19% ~13s
|++++++++++ | 20% ~12s
|+++++++++++ | 21% ~12s
|+++++++++++ | 22% ~12s
|++++++++++++ | 23% ~12s
|++++++++++++ | 24% ~12s
|+++++++++++++ | 25% ~12s
|+++++++++++++ | 26% ~11s
|++++++++++++++ | 27% ~11s
|++++++++++++++ | 28% ~11s
|+++++++++++++++ | 29% ~11s
|+++++++++++++++ | 30% ~11s
|++++++++++++++++ | 31% ~11s
|++++++++++++++++ | 32% ~10s
|+++++++++++++++++ | 33% ~10s
|++++++++++++++++++ | 34% ~10s
|++++++++++++++++++ | 35% ~10s
|+++++++++++++++++++ | 36% ~10s
|+++++++++++++++++++ | 37% ~10s
|++++++++++++++++++++ | 38% ~10s
|++++++++++++++++++++ | 39% ~09s
|+++++++++++++++++++++ | 40% ~09s
|+++++++++++++++++++++ | 41% ~09s
|++++++++++++++++++++++ | 42% ~09s
|++++++++++++++++++++++ | 43% ~09s
|+++++++++++++++++++++++ | 44% ~09s
|+++++++++++++++++++++++ | 45% ~08s
|++++++++++++++++++++++++ | 46% ~08s
|++++++++++++++++++++++++ | 47% ~08s
|+++++++++++++++++++++++++ | 48% ~08s
|+++++++++++++++++++++++++ | 49% ~08s
|++++++++++++++++++++++++++ | 51% ~08s
|++++++++++++++++++++++++++ | 52% ~07s
|+++++++++++++++++++++++++++ | 53% ~07s
|+++++++++++++++++++++++++++ | 54% ~07s
|++++++++++++++++++++++++++++ | 55% ~07s
|++++++++++++++++++++++++++++ | 56% ~07s
|+++++++++++++++++++++++++++++ | 57% ~07s
|+++++++++++++++++++++++++++++ | 58% ~06s
|++++++++++++++++++++++++++++++ | 59% ~06s
|++++++++++++++++++++++++++++++ | 60% ~06s
|+++++++++++++++++++++++++++++++ | 61% ~06s
|+++++++++++++++++++++++++++++++ | 62% ~06s
|++++++++++++++++++++++++++++++++ | 63% ~06s
|++++++++++++++++++++++++++++++++ | 64% ~06s
|+++++++++++++++++++++++++++++++++ | 65% ~05s
|+++++++++++++++++++++++++++++++++ | 66% ~05s
|++++++++++++++++++++++++++++++++++ | 67% ~05s
|+++++++++++++++++++++++++++++++++++ | 68% ~05s
|+++++++++++++++++++++++++++++++++++ | 69% ~05s
|++++++++++++++++++++++++++++++++++++ | 70% ~05s
|++++++++++++++++++++++++++++++++++++ | 71% ~04s
|+++++++++++++++++++++++++++++++++++++ | 72% ~04s
|+++++++++++++++++++++++++++++++++++++ | 73% ~04s
|++++++++++++++++++++++++++++++++++++++ | 74% ~04s
|++++++++++++++++++++++++++++++++++++++ | 75% ~04s
|+++++++++++++++++++++++++++++++++++++++ | 76% ~04s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~03s
|++++++++++++++++++++++++++++++++++++++++ | 78% ~03s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~03s
|+++++++++++++++++++++++++++++++++++++++++ | 80% ~03s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~03s
|++++++++++++++++++++++++++++++++++++++++++ | 82% ~03s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~03s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~02s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~02s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~02s
|++++++++++++++++++++++++++++++++++++++++++++ | 88% ~02s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~02s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~02s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=15s
Calculating cluster 2
| | 0 % ~calculating
|+ | 1 % ~02s
|++ | 2 % ~02s
|++ | 3 % ~02s
|+++ | 5 % ~02s
|+++ | 6 % ~02s
|++++ | 7 % ~02s
|++++ | 8 % ~02s
|+++++ | 9 % ~02s
|++++++ | 10% ~02s
|++++++ | 11% ~02s
|+++++++ | 12% ~02s
|+++++++ | 14% ~02s
|++++++++ | 15% ~02s
|++++++++ | 16% ~02s
|+++++++++ | 17% ~02s
|++++++++++ | 18% ~02s
|++++++++++ | 19% ~02s
|+++++++++++ | 20% ~01s
|+++++++++++ | 22% ~01s
|++++++++++++ | 23% ~01s
|++++++++++++ | 24% ~01s
|+++++++++++++ | 25% ~01s
|++++++++++++++ | 26% ~01s
|++++++++++++++ | 27% ~01s
|+++++++++++++++ | 28% ~01s
|+++++++++++++++ | 30% ~01s
|++++++++++++++++ | 31% ~01s
|++++++++++++++++ | 32% ~01s
|+++++++++++++++++ | 33% ~01s
|++++++++++++++++++ | 34% ~01s
|++++++++++++++++++ | 35% ~01s
|+++++++++++++++++++ | 36% ~01s
|+++++++++++++++++++ | 38% ~01s
|++++++++++++++++++++ | 39% ~01s
|++++++++++++++++++++ | 40% ~01s
|+++++++++++++++++++++ | 41% ~01s
|++++++++++++++++++++++ | 42% ~01s
|++++++++++++++++++++++ | 43% ~01s
|+++++++++++++++++++++++ | 44% ~01s
|+++++++++++++++++++++++ | 45% ~01s
|++++++++++++++++++++++++ | 47% ~01s
|++++++++++++++++++++++++ | 48% ~01s
|+++++++++++++++++++++++++ | 49% ~01s
|+++++++++++++++++++++++++ | 50% ~01s
|++++++++++++++++++++++++++ | 51% ~01s
|+++++++++++++++++++++++++++ | 52% ~01s
|+++++++++++++++++++++++++++ | 53% ~01s
|++++++++++++++++++++++++++++ | 55% ~01s
|++++++++++++++++++++++++++++ | 56% ~01s
|+++++++++++++++++++++++++++++ | 57% ~01s
|+++++++++++++++++++++++++++++ | 58% ~01s
|++++++++++++++++++++++++++++++ | 59% ~01s
|+++++++++++++++++++++++++++++++ | 60% ~01s
|+++++++++++++++++++++++++++++++ | 61% ~01s
|++++++++++++++++++++++++++++++++ | 62% ~01s
|++++++++++++++++++++++++++++++++ | 64% ~01s
|+++++++++++++++++++++++++++++++++ | 65% ~01s
|+++++++++++++++++++++++++++++++++ | 66% ~01s
|++++++++++++++++++++++++++++++++++ | 67% ~01s
|+++++++++++++++++++++++++++++++++++ | 68% ~01s
|+++++++++++++++++++++++++++++++++++ | 69% ~01s
|++++++++++++++++++++++++++++++++++++ | 70% ~01s
|++++++++++++++++++++++++++++++++++++ | 72% ~01s
|+++++++++++++++++++++++++++++++++++++ | 73% ~01s
|+++++++++++++++++++++++++++++++++++++ | 74% ~00s
|++++++++++++++++++++++++++++++++++++++ | 75% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 76% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 78% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 80% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 82% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 84% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~00s
|++++++++++++++++++++++++++++++++++++++++++++ | 86% ~00s
|++++++++++++++++++++++++++++++++++++++++++++ | 88% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02s
Calculating cluster 3
| | 0 % ~calculating
|+ | 1 % ~16s
|+ | 2 % ~16s
|++ | 3 % ~16s
|++ | 4 % ~16s
|+++ | 5 % ~16s
|+++ | 6 % ~16s
|++++ | 7 % ~16s
|++++ | 8 % ~15s
|+++++ | 9 % ~15s
|+++++ | 10% ~15s
|++++++ | 11% ~15s
|++++++ | 12% ~15s
|+++++++ | 13% ~15s
|+++++++ | 14% ~14s
|++++++++ | 15% ~14s
|++++++++ | 16% ~14s
|+++++++++ | 17% ~14s
|+++++++++ | 18% ~14s
|++++++++++ | 19% ~14s
|++++++++++ | 20% ~13s
|+++++++++++ | 21% ~13s
|+++++++++++ | 22% ~13s
|++++++++++++ | 23% ~13s
|++++++++++++ | 24% ~13s
|+++++++++++++ | 25% ~13s
|+++++++++++++ | 26% ~12s
|++++++++++++++ | 27% ~12s
|++++++++++++++ | 28% ~12s
|+++++++++++++++ | 29% ~12s
|+++++++++++++++ | 30% ~12s
|++++++++++++++++ | 31% ~12s
|++++++++++++++++ | 32% ~11s
|+++++++++++++++++ | 33% ~11s
|+++++++++++++++++ | 34% ~11s
|++++++++++++++++++ | 35% ~11s
|++++++++++++++++++ | 36% ~11s
|+++++++++++++++++++ | 37% ~11s
|+++++++++++++++++++ | 38% ~10s
|++++++++++++++++++++ | 39% ~10s
|++++++++++++++++++++ | 40% ~10s
|+++++++++++++++++++++ | 41% ~10s
|+++++++++++++++++++++ | 42% ~10s
|++++++++++++++++++++++ | 43% ~09s
|++++++++++++++++++++++ | 44% ~09s
|+++++++++++++++++++++++ | 45% ~09s
|+++++++++++++++++++++++ | 46% ~09s
|++++++++++++++++++++++++ | 47% ~09s
|++++++++++++++++++++++++ | 48% ~09s
|+++++++++++++++++++++++++ | 49% ~08s
|+++++++++++++++++++++++++ | 50% ~08s
|++++++++++++++++++++++++++ | 51% ~08s
|++++++++++++++++++++++++++ | 52% ~08s
|+++++++++++++++++++++++++++ | 53% ~08s
|+++++++++++++++++++++++++++ | 54% ~08s
|++++++++++++++++++++++++++++ | 55% ~08s
|++++++++++++++++++++++++++++ | 56% ~07s
|+++++++++++++++++++++++++++++ | 57% ~07s
|+++++++++++++++++++++++++++++ | 58% ~07s
|++++++++++++++++++++++++++++++ | 59% ~07s
|++++++++++++++++++++++++++++++ | 60% ~07s
|+++++++++++++++++++++++++++++++ | 61% ~07s
|+++++++++++++++++++++++++++++++ | 62% ~06s
|++++++++++++++++++++++++++++++++ | 63% ~06s
|++++++++++++++++++++++++++++++++ | 64% ~06s
|+++++++++++++++++++++++++++++++++ | 65% ~06s
|+++++++++++++++++++++++++++++++++ | 66% ~06s
|++++++++++++++++++++++++++++++++++ | 67% ~06s
|++++++++++++++++++++++++++++++++++ | 68% ~06s
|+++++++++++++++++++++++++++++++++++ | 69% ~05s
|+++++++++++++++++++++++++++++++++++ | 70% ~05s
|++++++++++++++++++++++++++++++++++++ | 71% ~05s
|++++++++++++++++++++++++++++++++++++ | 72% ~05s
|+++++++++++++++++++++++++++++++++++++ | 73% ~05s
|+++++++++++++++++++++++++++++++++++++ | 74% ~04s
|++++++++++++++++++++++++++++++++++++++ | 75% ~04s
|++++++++++++++++++++++++++++++++++++++ | 76% ~04s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~04s
|+++++++++++++++++++++++++++++++++++++++ | 78% ~04s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~04s
|++++++++++++++++++++++++++++++++++++++++ | 80% ~03s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~03s
|+++++++++++++++++++++++++++++++++++++++++ | 82% ~03s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~03s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~03s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~03s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~02s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~02s
|++++++++++++++++++++++++++++++++++++++++++++ | 88% ~02s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~02s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~02s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~02s
|++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=17s
Calculating cluster 4
| | 0 % ~calculating
|++ | 2 % ~01s
|+++ | 5 % ~01s
|++++ | 7 % ~01s
|+++++ | 10% ~01s
|++++++ | 12% ~01s
|++++++++ | 14% ~01s
|+++++++++ | 17% ~01s
|++++++++++ | 19% ~01s
|+++++++++++ | 21% ~01s
|++++++++++++ | 24% ~01s
|++++++++++++++ | 26% ~01s
|+++++++++++++++ | 29% ~01s
|++++++++++++++++ | 31% ~01s
|+++++++++++++++++ | 33% ~01s
|++++++++++++++++++ | 36% ~01s
|++++++++++++++++++++ | 38% ~01s
|+++++++++++++++++++++ | 40% ~01s
|++++++++++++++++++++++ | 43% ~00s
|+++++++++++++++++++++++ | 45% ~00s
|++++++++++++++++++++++++ | 48% ~00s
|+++++++++++++++++++++++++ | 50% ~00s
|+++++++++++++++++++++++++++ | 52% ~00s
|++++++++++++++++++++++++++++ | 55% ~00s
|+++++++++++++++++++++++++++++ | 57% ~00s
|++++++++++++++++++++++++++++++ | 60% ~00s
|+++++++++++++++++++++++++++++++ | 62% ~00s
|+++++++++++++++++++++++++++++++++ | 64% ~00s
|++++++++++++++++++++++++++++++++++ | 67% ~00s
|+++++++++++++++++++++++++++++++++++ | 69% ~00s
|++++++++++++++++++++++++++++++++++++ | 71% ~00s
|+++++++++++++++++++++++++++++++++++++ | 74% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 76% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 88% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 90% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Calculating cluster 5
| | 0 % ~calculating
|+ | 1 % ~18s
|++ | 2 % ~17s
|++ | 3 % ~17s
|+++ | 4 % ~17s
|+++ | 5 % ~16s
|++++ | 7 % ~16s
|++++ | 8 % ~16s
|+++++ | 9 % ~16s
|+++++ | 10% ~16s
|++++++ | 11% ~16s
|++++++ | 12% ~15s
|+++++++ | 13% ~15s
|++++++++ | 14% ~15s
|++++++++ | 15% ~15s
|+++++++++ | 16% ~15s
|+++++++++ | 17% ~15s
|++++++++++ | 18% ~15s
|++++++++++ | 20% ~14s
|+++++++++++ | 21% ~14s
|+++++++++++ | 22% ~14s
|++++++++++++ | 23% ~14s
|++++++++++++ | 24% ~14s
|+++++++++++++ | 25% ~14s
|++++++++++++++ | 26% ~14s
|++++++++++++++ | 27% ~13s
|+++++++++++++++ | 28% ~13s
|+++++++++++++++ | 29% ~13s
|++++++++++++++++ | 30% ~13s
|++++++++++++++++ | 32% ~13s
|+++++++++++++++++ | 33% ~12s
|+++++++++++++++++ | 34% ~12s
|++++++++++++++++++ | 35% ~12s
|++++++++++++++++++ | 36% ~12s
|+++++++++++++++++++ | 37% ~12s
|++++++++++++++++++++ | 38% ~11s
|++++++++++++++++++++ | 39% ~11s
|+++++++++++++++++++++ | 40% ~11s
|+++++++++++++++++++++ | 41% ~11s
|++++++++++++++++++++++ | 42% ~11s
|++++++++++++++++++++++ | 43% ~10s
|+++++++++++++++++++++++ | 45% ~10s
|+++++++++++++++++++++++ | 46% ~10s
|++++++++++++++++++++++++ | 47% ~10s
|++++++++++++++++++++++++ | 48% ~10s
|+++++++++++++++++++++++++ | 49% ~09s
|+++++++++++++++++++++++++ | 50% ~09s
|++++++++++++++++++++++++++ | 51% ~09s
|+++++++++++++++++++++++++++ | 52% ~09s
|+++++++++++++++++++++++++++ | 53% ~09s
|++++++++++++++++++++++++++++ | 54% ~08s
|++++++++++++++++++++++++++++ | 55% ~08s
|+++++++++++++++++++++++++++++ | 57% ~08s
|+++++++++++++++++++++++++++++ | 58% ~08s
|++++++++++++++++++++++++++++++ | 59% ~08s
|++++++++++++++++++++++++++++++ | 60% ~07s
|+++++++++++++++++++++++++++++++ | 61% ~07s
|+++++++++++++++++++++++++++++++ | 62% ~07s
|++++++++++++++++++++++++++++++++ | 63% ~07s
|+++++++++++++++++++++++++++++++++ | 64% ~07s
|+++++++++++++++++++++++++++++++++ | 65% ~06s
|++++++++++++++++++++++++++++++++++ | 66% ~06s
|++++++++++++++++++++++++++++++++++ | 67% ~06s
|+++++++++++++++++++++++++++++++++++ | 68% ~06s
|+++++++++++++++++++++++++++++++++++ | 70% ~06s
|++++++++++++++++++++++++++++++++++++ | 71% ~05s
|++++++++++++++++++++++++++++++++++++ | 72% ~05s
|+++++++++++++++++++++++++++++++++++++ | 73% ~05s
|+++++++++++++++++++++++++++++++++++++ | 74% ~05s
|++++++++++++++++++++++++++++++++++++++ | 75% ~05s
|+++++++++++++++++++++++++++++++++++++++ | 76% ~04s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~04s
|++++++++++++++++++++++++++++++++++++++++ | 78% ~04s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~04s
|+++++++++++++++++++++++++++++++++++++++++ | 80% ~04s
|+++++++++++++++++++++++++++++++++++++++++ | 82% ~03s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~03s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~03s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~03s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~03s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~02s
|+++++++++++++++++++++++++++++++++++++++++++++ | 88% ~02s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~02s
|++++++++++++++++++++++++++++++++++++++++++++++ | 90% ~02s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~02s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=18s
Calculating cluster 6
| | 0 % ~calculating
|+ | 1 % ~11s
|++ | 2 % ~11s
|++ | 3 % ~11s
|+++ | 4 % ~11s
|+++ | 5 % ~10s
|++++ | 6 % ~10s
|++++ | 7 % ~10s
|+++++ | 8 % ~10s
|+++++ | 9 % ~10s
|++++++ | 10% ~10s
|++++++ | 11% ~10s
|+++++++ | 12% ~09s
|+++++++ | 13% ~09s
|++++++++ | 14% ~09s
|++++++++ | 15% ~09s
|+++++++++ | 16% ~09s
|+++++++++ | 17% ~09s
|++++++++++ | 18% ~09s
|++++++++++ | 19% ~09s
|+++++++++++ | 20% ~09s
|+++++++++++ | 21% ~09s
|++++++++++++ | 22% ~09s
|++++++++++++ | 23% ~08s
|+++++++++++++ | 24% ~08s
|+++++++++++++ | 26% ~08s
|++++++++++++++ | 27% ~08s
|++++++++++++++ | 28% ~08s
|+++++++++++++++ | 29% ~08s
|+++++++++++++++ | 30% ~08s
|++++++++++++++++ | 31% ~09s
|++++++++++++++++ | 32% ~08s
|+++++++++++++++++ | 33% ~08s
|+++++++++++++++++ | 34% ~08s
|++++++++++++++++++ | 35% ~08s
|++++++++++++++++++ | 36% ~08s
|+++++++++++++++++++ | 37% ~08s
|+++++++++++++++++++ | 38% ~08s
|++++++++++++++++++++ | 39% ~07s
|++++++++++++++++++++ | 40% ~07s
|+++++++++++++++++++++ | 41% ~07s
|+++++++++++++++++++++ | 42% ~07s
|++++++++++++++++++++++ | 43% ~07s
|++++++++++++++++++++++ | 44% ~07s
|+++++++++++++++++++++++ | 45% ~07s
|+++++++++++++++++++++++ | 46% ~06s
|++++++++++++++++++++++++ | 47% ~06s
|++++++++++++++++++++++++ | 48% ~06s
|+++++++++++++++++++++++++ | 49% ~06s
|+++++++++++++++++++++++++ | 50% ~06s
|++++++++++++++++++++++++++ | 51% ~06s
|+++++++++++++++++++++++++++ | 52% ~06s
|+++++++++++++++++++++++++++ | 53% ~06s
|++++++++++++++++++++++++++++ | 54% ~05s
|++++++++++++++++++++++++++++ | 55% ~05s
|+++++++++++++++++++++++++++++ | 56% ~05s
|+++++++++++++++++++++++++++++ | 57% ~05s
|++++++++++++++++++++++++++++++ | 58% ~05s
|++++++++++++++++++++++++++++++ | 59% ~05s
|+++++++++++++++++++++++++++++++ | 60% ~05s
|+++++++++++++++++++++++++++++++ | 61% ~05s
|++++++++++++++++++++++++++++++++ | 62% ~04s
|++++++++++++++++++++++++++++++++ | 63% ~04s
|+++++++++++++++++++++++++++++++++ | 64% ~04s
|+++++++++++++++++++++++++++++++++ | 65% ~04s
|++++++++++++++++++++++++++++++++++ | 66% ~04s
|++++++++++++++++++++++++++++++++++ | 67% ~04s
|+++++++++++++++++++++++++++++++++++ | 68% ~04s
|+++++++++++++++++++++++++++++++++++ | 69% ~04s
|++++++++++++++++++++++++++++++++++++ | 70% ~03s
|++++++++++++++++++++++++++++++++++++ | 71% ~03s
|+++++++++++++++++++++++++++++++++++++ | 72% ~03s
|+++++++++++++++++++++++++++++++++++++ | 73% ~03s
|++++++++++++++++++++++++++++++++++++++ | 74% ~03s
|++++++++++++++++++++++++++++++++++++++ | 76% ~03s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~03s
|+++++++++++++++++++++++++++++++++++++++ | 78% ~03s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~02s
|++++++++++++++++++++++++++++++++++++++++ | 80% ~02s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~02s
|+++++++++++++++++++++++++++++++++++++++++ | 82% ~02s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~02s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~02s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~02s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~02s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~01s
|++++++++++++++++++++++++++++++++++++++++++++ | 88% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=11s
Calculating cluster 7
| | 0 % ~calculating
|+ | 1 % ~05s
|++ | 2 % ~05s
|++ | 3 % ~05s
|+++ | 4 % ~05s
|+++ | 5 % ~05s
|++++ | 6 % ~05s
|++++ | 7 % ~04s
|+++++ | 8 % ~04s
|+++++ | 9 % ~04s
|++++++ | 10% ~04s
|++++++ | 11% ~04s
|+++++++ | 12% ~04s
|+++++++ | 14% ~04s
|++++++++ | 15% ~04s
|++++++++ | 16% ~04s
|+++++++++ | 17% ~04s
|+++++++++ | 18% ~04s
|++++++++++ | 19% ~04s
|++++++++++ | 20% ~04s
|+++++++++++ | 21% ~04s
|+++++++++++ | 22% ~04s
|++++++++++++ | 23% ~04s
|++++++++++++ | 24% ~04s
|+++++++++++++ | 25% ~04s
|++++++++++++++ | 26% ~04s
|++++++++++++++ | 27% ~04s
|+++++++++++++++ | 28% ~03s
|+++++++++++++++ | 29% ~03s
|++++++++++++++++ | 30% ~03s
|++++++++++++++++ | 31% ~03s
|+++++++++++++++++ | 32% ~03s
|+++++++++++++++++ | 33% ~03s
|++++++++++++++++++ | 34% ~03s
|++++++++++++++++++ | 35% ~03s
|+++++++++++++++++++ | 36% ~03s
|+++++++++++++++++++ | 38% ~03s
|++++++++++++++++++++ | 39% ~03s
|++++++++++++++++++++ | 40% ~03s
|+++++++++++++++++++++ | 41% ~03s
|+++++++++++++++++++++ | 42% ~03s
|++++++++++++++++++++++ | 43% ~03s
|++++++++++++++++++++++ | 44% ~03s
|+++++++++++++++++++++++ | 45% ~03s
|+++++++++++++++++++++++ | 46% ~03s
|++++++++++++++++++++++++ | 47% ~03s
|++++++++++++++++++++++++ | 48% ~03s
|+++++++++++++++++++++++++ | 49% ~03s
|+++++++++++++++++++++++++ | 50% ~02s
|++++++++++++++++++++++++++ | 51% ~02s
|+++++++++++++++++++++++++++ | 52% ~02s
|+++++++++++++++++++++++++++ | 53% ~02s
|++++++++++++++++++++++++++++ | 54% ~02s
|++++++++++++++++++++++++++++ | 55% ~02s
|+++++++++++++++++++++++++++++ | 56% ~02s
|+++++++++++++++++++++++++++++ | 57% ~02s
|++++++++++++++++++++++++++++++ | 58% ~02s
|++++++++++++++++++++++++++++++ | 59% ~02s
|+++++++++++++++++++++++++++++++ | 60% ~02s
|+++++++++++++++++++++++++++++++ | 61% ~02s
|++++++++++++++++++++++++++++++++ | 62% ~02s
|++++++++++++++++++++++++++++++++ | 64% ~02s
|+++++++++++++++++++++++++++++++++ | 65% ~02s
|+++++++++++++++++++++++++++++++++ | 66% ~02s
|++++++++++++++++++++++++++++++++++ | 67% ~02s
|++++++++++++++++++++++++++++++++++ | 68% ~02s
|+++++++++++++++++++++++++++++++++++ | 69% ~02s
|+++++++++++++++++++++++++++++++++++ | 70% ~02s
|++++++++++++++++++++++++++++++++++++ | 71% ~01s
|++++++++++++++++++++++++++++++++++++ | 72% ~01s
|+++++++++++++++++++++++++++++++++++++ | 73% ~01s
|+++++++++++++++++++++++++++++++++++++ | 74% ~01s
|++++++++++++++++++++++++++++++++++++++ | 75% ~01s
|+++++++++++++++++++++++++++++++++++++++ | 76% ~01s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~01s
|++++++++++++++++++++++++++++++++++++++++ | 78% ~01s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~01s
|+++++++++++++++++++++++++++++++++++++++++ | 80% ~01s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~01s
|++++++++++++++++++++++++++++++++++++++++++ | 82% ~01s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~01s
|+++++++++++++++++++++++++++++++++++++++++++ | 84% ~01s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~01s
|++++++++++++++++++++++++++++++++++++++++++++ | 86% ~01s
|++++++++++++++++++++++++++++++++++++++++++++ | 88% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=05s
Lets look at the markers to see if they identify cell types
head(ClusterMarkers)
top5 <- ClusterMarkers %>% group_by(cluster) %>% top_n(n=5, wt =avg_log2FC)
DoHeatmap(integrated_seurate, features = top5$gene, size = 3, angle = 90, group.by = "integrated_snn_res.0.4")
We can see cluster 1 and have a huge overlap. Cluster 5 and 6 are also overlapping.
We can look at the markers for each cluster using Gene set analysis or cell type libraries
library(devtools)
Loading required package: usethis
#install_github("wjawaid/enrichR")
library(enrichR)
Welcome to enrichR
Checking connection ...
Enrichr ... Connection is Live!
FlyEnrichr ... Connection is available!
WormEnrichr ... Connection is available!
YeastEnrichr ... Connection is available!
FishEnrichr ... Connection is available!
OxEnrichr ... Connection is available!
setEnrichrSite("Enrichr") # Human genes
Connection changed to https://maayanlab.cloud/Enrichr/
Connection is Live!
# list of all the databases
dbs <- listEnrichrDbs()
dbs
# libaries with cell types
db <- c('CellMarker_Augmented_2021','Azimuth_Cell_Types_2021')
Here is a small fuction to easily check each cluster
checkCelltypes <- function(cluster_num = 0){
clusterX <- ClusterMarkers %>% filter(cluster == cluster_num & avg_log2FC > 0.25)
genes <- clusterX$gene
# the cell type libraries
# get the results for each library
clusterX.cell <- enrichr(genes, databases = db)
# visulize the results
print(plotEnrich(clusterX.cell[[1]], showTerms = 20, numChar = 40, y = "Count", orderBy = "P.value", title = 'CellMarker_Augmented_2021'))
print(plotEnrich(clusterX.cell[[2]], showTerms = 20, numChar = 40, y = "Count", orderBy = "P.value", title = 'Azimuth_Cell_Types_2021'))
}
Run the function for each cluster to see if we can identify cell types
cluster0 <- checkCelltypes(cluster_num = 0)
Uploading data to Enrichr... Done.
Querying CellMarker_Augmented_2021... Done.
Querying Azimuth_Cell_Types_2021... Done.
Parsing results... Done.
Cluster 0 is some kind of immune cells, possibly dendritic cell or macrophage. These are likely the microglia
cluster1 <- checkCelltypes(cluster_num = 1)
Uploading data to Enrichr... Done.
Querying CellMarker_Augmented_2021... Done.
Querying Azimuth_Cell_Types_2021... Done.
Parsing results... Done.
Cluster 1 appears to have oligodendrocytes
cluster2 <- checkCelltypes(cluster_num = 2)
Uploading data to Enrichr... Done.
Querying CellMarker_Augmented_2021... Done.
Querying Azimuth_Cell_Types_2021... Done.
Parsing results... Done.
Cluster 2 is also immune cells, embryonic microglia.
# cluster 3
cluster3 <- checkCelltypes(cluster_num = 3)
# cluster 4
cluster4 <- checkCelltypes(cluster_num = 4)
# cluster 5
cluster5 <- checkCelltypes(cluster_num = 5)
# cluster 6
# cluster 7
We can remove the lists of cell library outputs to clean up our work envirnment
rm(cluster0,cluster1,cluster2)
Looking at known cell type markers is another way to identify cell types
# marker lists
# doi: 10.1002/glia.23767
microglia <- c("IBA1","P2RY12","P2RY13","TREM119", "GPR34","SIGLECH","TREM2",
"CX3CR1","FCRLS","OLFML3","HEXB","TGFBR1", "SALL1","MERTK",
"PROS1")
oligodendrocyte <- c("MBP","MOG","OLIG1","OLIG2","SOX10")
# doi: 10.3390/biom11091361
astrocytes <- c("GFAP","S100B","AQP4","APOE", "SOX9")
Look at the expression of different marker lists
DoHeatmap(integrated_seurate, features = microglia, group.by = "integrated_snn_res.0.4")
Warning in DoHeatmap(integrated_seurate, features = microglia, group.by = "integrated_snn_res.0.4") :
The following features were omitted as they were not found in the scale.data slot for the RNA assay: FCRLS, SIGLECH, TREM119, IBA1
DotPlot(integrated_seurate, features = microglia, group.by = "integrated_snn_res.0.4") + RotatedAxis()
Warning in FetchData.Seurat(object = object, vars = features, cells = cells) :
The following requested variables were not found: IBA1, TREM119, SIGLECH, FCRLS
Idents(integrated_seurate) <- "integrated_snn_res.0.4"
FeaturePlot(integrated_seurate, features = microglia, label = TRUE)
Warning in FetchData.Seurat(object = object, vars = c(dims, "ident", features), :
The following requested variables were not found: IBA1, TREM119, SIGLECH, FCRLS
Cluster 0 has some microglial markers and cluster 2 has lots of microglia markers.
# oligodendrocytes
DoHeatmap(integrated_seurate, features = oligodendrocyte, group.by = "integrated_snn_res.0.4")
DotPlot(integrated_seurate, features = oligodendrocyte, group.by = "integrated_snn_res.0.4") + RotatedAxis()
Cluster 1,3,5 have mature oligodendrocyte markers and cluster 6 has strong oligo lineage markers
# astrocyte
DoHeatmap(integrated_seurate, features = astrocytes, group.by = "integrated_snn_res.0.4")
DotPlot(integrated_seurate, features = astrocytes, group.by = "integrated_snn_res.0.4") + RotatedAxis()
Cluster 5 is the best match for astrocyte markers
Automated cell type annotation
Options scCATCH - scoring CellAssign Bayesian ClustifyR correlation multiple methods Garnett elastic net regression scPred SVM scClassify R weighted kNN https://doi.org/10.15252/msb.20199389
scClassify https://sydneybiox.github.io/scClassify/articles/scClassify.html
# we need to create the reference matrices
#matrices where each row is a gene and each column a cell for a reference dataset and a query dataset
# refrence data
path <- "/Users/rhalenathomas/Documents/Data/scRNAseq/PublicData/"
# we have the data prepared as a Seurat object
ref <- readRDS(paste(path,"Nowakowski_dev_cortext.RDS",sep = ""))
ref
An object of class Seurat
56865 features across 4261 samples within 1 assay
Active assay: RNA (56865 features, 2000 variable features)
1 dimensional reduction calculated: pca
#check the meta data
colnames(ref@meta.data)
[1] "orig.ident" "nCount_RNA" "nFeature_RNA" "WGCNAcluster" "Name"
[6] "Age_in_Weeks" "RegionName" "Laminae" "Area"
The labelled cell types are in WGCNAcluster
We need to make the reference and query objects into a “dgCMatrix” object
combined_dgCMatrix <- as(combined_mat, "dgCMatrix")
Error in asMethod(object) :
'x' slot of invalid type "character" in 'R_dense_as_sparse()'
See the original cell type annotations
table(dgCMat_ref)
Error in unique.default(x, nmax = nmax) :
unique() applies only to vectors
We first perform non-ensemble scClassify
scClassify_res <- scClassify(exprsMat_train = exprsMat_xin_subset,
cellTypes_train = xin_cellTypes,
exprsMat_test = list(wang = exprsMat_wang_subset),
cellTypes_test = list(wang = wang_cellTypes),
tree = "HOPACH",
algorithm = "WKNN",
selectFeatures = c("limma"),
similarity = c("pearson"),
returnList = FALSE,
verbose = FALSE)
We can check the cell type tree generated by the reference data:
scClassify_res$trainRes
Class: scClassifyTrainModel
Model name: training
Feature selection methods: limma
Number of cells in the training data: 674
Number of cell types in the training data: 4
plotCellTypeTree(cellTypeTree(scClassify_res$trainRes))
Warning: Removed 3 rows containing missing values (`geom_text()`).
Check the predictions
table(scClassify_res$testRes$wang$pearson_WKNN_limma$predRes, wang_cellTypes)
wang_cellTypes
acinar alpha beta delta ductal gamma stellate
alpha 0 206 0 0 0 2 0
beta 0 0 118 0 1 0 0
beta_delta_gamma 0 0 0 0 25 0 0
delta 0 0 0 10 0 0 0
gamma 0 0 0 0 0 19 0
unassigned 5 0 0 0 70 0 45